# Medical Image Analysis

Medgemma 4b It Bf16
Other
MedGemma-4B-IT is a medical vision-language model developed by Google, converted here to MLX format to run efficiently on Apple silicon.
Image-to-Text Transformers
mlx-community
131
1
Medgemma 4b It Q8 0 GGUF
Other
MedGemma-4B-it-Q8_0-GGUF is a GGUF format model converted from google/medgemma-4b-it, specifically designed for image-to-text tasks in the medical field.
Image-to-Text Transformers
NikolayKozloff
142
2
Computer Vision Project
Apache-2.0
A DINOv2-based model fine-tuned to classify diseases from skin lesion images.
Image Classification English
Kar1hik
164
1
Aggregate Segmentation
MIT
PyTorch-based DeepLabV3Plus image segmentation model supporting efficient semantic segmentation tasks
Image Segmentation
Matiullah2401592
78
1
Gemma 3 4b It Abliterated Q4 0 GGUF
This model is a GGUF format conversion of mlabonne/gemma-3-4b-it-abliterated, combined with the visual component of x-ray_alpha for a smoother multimodal experience.
Image-to-Text
BernTheCreator
160
1
AKI 4B Phi 3.5 Mini
AKI is a multimodal foundation model that unlocks the causal attention mechanism in LLMs into modality-mutual attention (MMA), addressing vision-language misalignment without additional parameters or training time.
Image-to-Text English
Sony
25
27
Vit Chest Xray
MIT
A fine-tuned model based on Vision Transformer (ViT) architecture for classifying chest X-rays, trained on the CheXpert dataset.
Image Classification Transformers English
codewithdark
316
1
Mammoscreen
Apache-2.0
An ensemble model that predicts breast cancer and breast density from screening mammograms, running inference with three CNNs at different resolutions.
Image Classification Transformers
ianpan
76
1
Erax VL 7B V2.0 Preview
Apache-2.0
EraX-VL-7B-V2.0-Preview is a powerful multimodal model designed for OCR and visual question answering, excelling in processing multiple languages including Vietnamese, with outstanding performance in recognizing medical forms, invoices, and other documents.
Image-to-Text Transformers Supports Multiple Languages
erax-ai
476
22
Genmedclip
MIT
GenMedClip is a zero-shot image classification model based on the open_clip library, specializing in medical image analysis.
Image Classification
wisdomik
40
0
Fpn Tu Resnet18
MIT
A PyTorch-implemented FPN image segmentation model that supports various encoder architectures, suitable for semantic segmentation tasks.
Image Segmentation Safetensors
smp-test-models
217
0
Linknet Tu Resnet18
MIT
Linknet is a PyTorch-implemented image segmentation model suitable for semantic segmentation tasks.
Image Segmentation
smp-test-models
214
0
Vit Base Brain Mri
Apache-2.0
An image classification model based on Google's ViT base model, fine-tuned on the BrainMRI dataset.
Image Classification Transformers
andrei-teodor
42
1
Bio Medical MultiModal Llama 3 8B V1
Other
A multimodal biomedical model fine-tuned from Llama-3-8B-Instruct that processes both text and images, suitable for biomedical research and clinical applications.
Image-to-Text Transformers
ContactDoctor
1,440
122
Florence 2 FT Lung Cancer Detection
A lung cancer detection model fine-tuned from Florence-2-base-ft that identifies lung cancer types from lung images.
Text-to-Image Transformers English
nirusanan
20
1
Xray Model
MIT
This model is used for bone age prediction, built upon the YassinHegazy/xray-model base model.
Image Classification
YassinHegazy
29
0
Virchow
Apache-2.0
Virchow is a self-supervised vision transformer pretrained on 1.5 million whole-slide histopathology images, serving as a tile-level feature extractor for downstream computational pathology tasks.
Image Classification
paige-ai
5,121
57
M3D CLIP
Apache-2.0
M3D-CLIP is a CLIP model specifically designed for 3D medical imaging, achieving visual and language alignment through contrastive loss.
Multimodal Alignment Transformers
GoodBaiBai88
2,962
9
Interpret Cxr Impression Baseline
This model can convert medical images (such as X-rays) into descriptive text to assist in medical diagnosis.
Image-to-Text Transformers
IAMJB
17
0
Pneumonia Model
A deep learning model based on ViT architecture for identifying pneumonia symptoms in chest X-ray images
Image Classification Transformers
Borjamg
25
1
Skin Types Image Detection
Apache-2.0
A facial image classification model using Vision Transformer (ViT) architecture for detecting dry, normal, and oily skin types
Image Classification Transformers
dima806
776
11
Dinov2 Base Xray 224
Apache-2.0
The AIMI Foundation Model Suite is a collection of foundation models for the radiology domain developed by the Stanford AIMI team, focusing on medical image analysis tasks.
Image Classification Transformers
StanfordAIMI
32.11k
2
Dinov2 Base Finetuned SkinDisease
Apache-2.0
A skin disease classification model fine-tuned from the DINOv2 base model, achieving 95.57% accuracy on the ISIC 2018 + Atlas Dermatology dataset.
Image Classification Transformers
Jayanth2002
1,584
3
Breast Cancer SAM V1
Apache-2.0
Breast cancer segmentation model based on Segment Anything Model (SAM), used for tumor region identification in medical imaging
Image Segmentation Transformers Supports Multiple Languages
ayoubkirouane
162
11
Segformer For Optic Disc Cup Segmentation
Apache-2.0
A retinal fundus image segmentation model based on the SegFormer architecture, specifically designed for precise segmentation of the optic disc and cup.
Image Segmentation Transformers
pamixsun
2,592
5
Skinsam
Apache-2.0
SkinSAM is a skin lesion segmentation model based on a 12-layer ViT-B architecture, fine-tuned on the ISIC and PH2 datasets.
Image Segmentation Transformers Supports Multiple Languages
ahishamm
71
1
Segformer B0 Finetuned Teeth Segmentation
Other
A dental X-ray segmentation model fine-tuned from the MiT-B0 SegFormer backbone, designed to segment tooth regions in dental imaging.
Image Segmentation Transformers
vimassaru
55
1
Pubmed Clip Vit Base Patch32
MIT
PubMedCLIP is a version of the CLIP model fine-tuned for the medical field, specifically designed to handle medical images and related text.
Text-to-Image English
flaviagiammarino
10.27k
19
Efficientnet ParkinsonsPred
MIT
A Parkinson's disease prediction model based on the EfficientNet architecture, achieving approximately 83% accuracy by analyzing patient drawings
Image Classification Transformers Other
dhhd255
17
2
Vit Base Patch16 224 Chest X Ray
Apache-2.0
This model is a fine-tuned version of Google's ViT-base model on a chest X-ray classification dataset, designed for medical image analysis.
Image Classification Transformers
chanelcolgate
229
1
Histo Train
Apache-2.0
An image classification model fine-tuned from google/vit-base-patch16-224, suitable for histology image analysis tasks.
Image Classification Transformers
tcvrishank
36
0
Clipmd
ClipMD is a medical image-text matching model built on OpenAI's CLIP model. It uses a sliding-window text encoder and targets medical image classification tasks.
Image-to-Text Transformers English
Idan0405
165
8
Vit Pneumonia
Apache-2.0
A pneumonia detection model based on the ViT architecture, fine-tuned on a chest X-ray classification dataset with an accuracy rate of 97.68%
Image Classification Transformers
trpakov
23
0
Detr Resnet 50 CD45RB 1000 Att
Apache-2.0
A fine-tuned model based on facebook/detr-resnet-50 for object detection tasks
Object Detection Transformers
polejowska
13
0
Vit Mlo 512 Birads
An image classification model based on the Vision Transformer architecture, fine-tuned for BI-RADS classification tasks.
Image Classification Transformers
mm-ai
37
0
Resnet 50 Finetuned Brain Tumor
Apache-2.0
A brain tumor image classification model fine-tuned from microsoft/resnet-50, achieving 91.71% accuracy on the evaluation set.
Image Classification Transformers
Alia-Mohammed
472
0
Vit Diabetic Retinopathy Classification
Apache-2.0
A diabetic retinopathy classification model based on the Vision Transformer (ViT) architecture, achieving 72.87% accuracy on the evaluation set
Image Classification Transformers
Kontawat
197
3
Swin Tiny Patch4 Window7 224 Finetuned Skin Cancer
Apache-2.0
A fine-tuned model based on the Swin Transformer architecture, specifically designed for skin cancer image classification tasks
Image Classification Transformers
MPSTME
18
0
Swin Tiny Patch4 Window7 224 Finetuned Skin Cancer
Apache-2.0
A fine-tuned model based on Swin Transformer architecture, specifically designed for skin cancer image classification tasks
Image Classification Transformers
dhairyakapadia
17
0
Vit Base Patch16 224 Finetuned Chest
Apache-2.0
An image classification model fine-tuned on chest image datasets based on Google's ViT model, achieving 99% accuracy
Image Classification Transformers
adielsa
37
0
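
Several entries above (M3D-CLIP, PubMedCLIP, GenMedClip, ClipMD) are described as CLIP-style models that align images and text through a contrastive loss. As a rough illustration of that idea only, the sketch below computes a symmetric contrastive loss over a small batch of paired embeddings. It is not taken from any of the listed models: the function name `clip_contrastive_loss`, the default `temperature` of 0.07, and the toy embeddings are assumptions made for this example.

```python
import math


def _normalize(vec):
    """Scale a vector to unit length (zero vectors are left unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec)) or 1.0
    return [x / norm for x in vec]


def _log_softmax(row):
    """Numerically stable log-softmax of one row of logits."""
    m = max(row)
    log_z = m + math.log(sum(math.exp(x - m) for x in row))
    return [x - log_z for x in row]


def clip_contrastive_loss(image_embs, text_embs, temperature=0.07):
    """Symmetric cross-entropy over cosine similarities.

    Hypothetical helper for illustration: image_embs[i] and text_embs[i]
    are assumed to be a matching pair; every other combination in the
    batch is treated as a negative.
    """
    if not image_embs or len(image_embs) != len(text_embs):
        raise ValueError("expected two non-empty batches of equal size")
    imgs = [_normalize(v) for v in image_embs]
    txts = [_normalize(v) for v in text_embs]
    n = len(imgs)
    # logits[i][j] = cosine similarity of image i and text j, scaled by temperature
    logits = [
        [sum(a * b for a, b in zip(img, txt)) / temperature for txt in txts]
        for img in imgs
    ]
    # Image-to-text direction: each row should peak on its diagonal entry.
    i2t = -sum(_log_softmax(logits[k])[k] for k in range(n)) / n
    # Text-to-image direction: same computation on the transposed matrix.
    columns = [[logits[r][c] for r in range(n)] for c in range(n)]
    t2i = -sum(_log_softmax(columns[k])[k] for k in range(n)) / n
    return 0.5 * (i2t + t2i)


if __name__ == "__main__":
    images = [[1.0, 0.0], [0.0, 1.0]]
    matched = clip_contrastive_loss(images, [[1.0, 0.0], [0.0, 1.0]])
    swapped = clip_contrastive_loss(images, [[0.0, 1.0], [1.0, 0.0]])
    print(f"matched pairs: {matched:.6f}, swapped pairs: {swapped:.6f}")
```

The loss is small when each image embedding is closest to its own text embedding and large when the pairs are swapped. A real model would produce the embeddings with trained image and text encoders and typically learn the temperature.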
© 2025 AIbase